{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# For tips on running notebooks in Google Colab, see\n",
    "# https://pytorch.org/tutorials/beginner/colab\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[Learn the Basics](intro.html) \\|\\|\n",
    "[Quickstart](quickstart_tutorial.html) \\|\\|\n",
    "[Tensors](tensorqs_tutorial.html) \\|\\| [Datasets &\n",
    "DataLoaders](data_tutorial.html) \\|\\|\n",
    "[Transforms](transforms_tutorial.html) \\|\\| **Build Model** \\|\\|\n",
    "[Autograd](autogradqs_tutorial.html) \\|\\|\n",
    "[Optimization](optimization_tutorial.html) \\|\\| [Save & Load\n",
    "Model](saveloadrun_tutorial.html)\n",
    "\n",
    "Build the Neural Network\n",
    "========================\n",
    "\n",
    "Neural networks comprise of layers/modules that perform operations on\n",
    "data. The [torch.nn](https://pytorch.org/docs/stable/nn.html) namespace\n",
    "provides all the building blocks you need to build your own neural\n",
    "network. Every module in PyTorch subclasses the\n",
    "[nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html).\n",
    "A neural network is a module itself that consists of other modules\n",
    "(layers). This nested structure allows for building and managing complex\n",
    "architectures easily.\n",
    "\n",
    "In the following sections, we\\'ll build a neural network to classify\n",
    "images in the FashionMNIST dataset.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import torch\n",
    "from torch import nn\n",
    "from torch.utils.data import DataLoader\n",
    "from torchvision import datasets, transforms"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Get Device for Training\n",
    "=======================\n",
    "\n",
    "We want to be able to train our model on a hardware accelerator like the\n",
    "GPU or MPS, if available. Let\\'s check to see if\n",
    "[torch.cuda](https://pytorch.org/docs/stable/notes/cuda.html) or\n",
    "[torch.backends.mps](https://pytorch.org/docs/stable/notes/mps.html) are\n",
    "available, otherwise we use the CPU.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "device = (\n",
    "    \"cuda\"\n",
    "    if torch.cuda.is_available()\n",
    "    else \"mps\"\n",
    "    if torch.backends.mps.is_available()\n",
    "    else \"cpu\"\n",
    ")\n",
    "print(f\"Using {device} device\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define the Class\n",
    "================\n",
    "\n",
    "We define our neural network by subclassing `nn.Module`, and initialize\n",
    "the neural network layers in `__init__`. Every `nn.Module` subclass\n",
    "implements the operations on input data in the `forward` method.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "class NeuralNetwork(nn.Module):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.flatten = nn.Flatten()\n",
    "        self.linear_relu_stack = nn.Sequential(\n",
    "            nn.Linear(28*28, 512),\n",
    "            nn.ReLU(),\n",
    "            nn.Linear(512, 512),\n",
    "            nn.ReLU(),\n",
    "            nn.Linear(512, 10),\n",
    "        )\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = self.flatten(x)\n",
    "        logits = self.linear_relu_stack(x)\n",
    "        return logits"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create an instance of `NeuralNetwork`, and move it to the `device`,\n",
    "and print its structure.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "model = NeuralNetwork().to(device)\n",
    "print(model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To use the model, we pass it the input data. This executes the model\\'s\n",
    "`forward`, along with some [background\n",
    "operations](https://github.com/pytorch/pytorch/blob/270111b7b611d174967ed204776985cefca9c144/torch/nn/modules/module.py#L866).\n",
    "Do not call `model.forward()` directly!\n",
    "\n",
    "Calling the model on the input returns a 2-dimensional tensor with dim=0\n",
    "corresponding to each output of 10 raw predicted values for each class,\n",
    "and dim=1 corresponding to the individual values of each output. We get\n",
    "the prediction probabilities by passing it through an instance of the\n",
    "`nn.Softmax` module.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "X = torch.rand(1, 28, 28, device=device)\n",
    "logits = model(X)\n",
    "pred_probab = nn.Softmax(dim=1)(logits)\n",
    "y_pred = pred_probab.argmax(1)\n",
    "print(f\"Predicted class: {y_pred}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "------------------------------------------------------------------------\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Model Layers\n",
    "============\n",
    "\n",
    "Let\\'s break down the layers in the FashionMNIST model. To illustrate\n",
    "it, we will take a sample minibatch of 3 images of size 28x28 and see\n",
    "what happens to it as we pass it through the network.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "input_image = torch.rand(3,28,28)\n",
    "print(input_image.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Flatten\n",
    "==========\n",
    "\n",
    "We initialize the\n",
    "[nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html)\n",
    "layer to convert each 2D 28x28 image into a contiguous array of 784\n",
    "pixel values ( the minibatch dimension (at dim=0) is maintained).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "flatten = nn.Flatten()\n",
    "flat_image = flatten(input_image)\n",
    "print(flat_image.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Linear\n",
    "=========\n",
    "\n",
    "The [linear\n",
    "layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)\n",
    "is a module that applies a linear transformation on the input using its\n",
    "stored weights and biases.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "layer1 = nn.Linear(in_features=28*28, out_features=20)\n",
    "hidden1 = layer1(flat_image)\n",
    "print(hidden1.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.ReLU\n",
    "=======\n",
    "\n",
    "Non-linear activations are what create the complex mappings between the\n",
    "model\\'s inputs and outputs. They are applied after linear\n",
    "transformations to introduce *nonlinearity*, helping neural networks\n",
    "learn a wide variety of phenomena.\n",
    "\n",
    "In this model, we use\n",
    "[nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)\n",
    "between our linear layers, but there\\'s other activations to introduce\n",
    "non-linearity in your model.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print(f\"Before ReLU: {hidden1}\\n\\n\")\n",
    "hidden1 = nn.ReLU()(hidden1)\n",
    "print(f\"After ReLU: {hidden1}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Sequential\n",
    "=============\n",
    "\n",
    "[nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html)\n",
    "is an ordered container of modules. The data is passed through all the\n",
    "modules in the same order as defined. You can use sequential containers\n",
    "to put together a quick network like `seq_modules`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "seq_modules = nn.Sequential(\n",
    "    flatten,\n",
    "    layer1,\n",
    "    nn.ReLU(),\n",
    "    nn.Linear(20, 10)\n",
    ")\n",
    "input_image = torch.rand(3,28,28)\n",
    "logits = seq_modules(input_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Softmax\n",
    "==========\n",
    "\n",
    "The last linear layer of the neural network returns [logits]{.title-ref}\n",
    "- raw values in \\[-infty, infty\\] - which are passed to the\n",
    "[nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html)\n",
    "module. The logits are scaled to values \\[0, 1\\] representing the\n",
    "model\\'s predicted probabilities for each class. `dim` parameter\n",
    "indicates the dimension along which the values must sum to 1.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "softmax = nn.Softmax(dim=1)\n",
    "pred_probab = softmax(logits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Model Parameters\n",
    "================\n",
    "\n",
    "Many layers inside a neural network are *parameterized*, i.e. have\n",
    "associated weights and biases that are optimized during training.\n",
    "Subclassing `nn.Module` automatically tracks all fields defined inside\n",
    "your model object, and makes all parameters accessible using your\n",
    "model\\'s `parameters()` or `named_parameters()` methods.\n",
    "\n",
    "In this example, we iterate over each parameter, and print its size and\n",
    "a preview of its values.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print(f\"Model structure: {model}\\n\\n\")\n",
    "\n",
    "for name, param in model.named_parameters():\n",
    "    print(f\"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "------------------------------------------------------------------------\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Further Reading\n",
    "===============\n",
    "\n",
    "-   [torch.nn API](https://pytorch.org/docs/stable/nn.html)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
